Abstract 

This paper presents four methods for determining the number of factors to retain: 
the eigenvalue greater than unity rule (K1), the scree test, the minimum average partial 
(MAP), and parallel analysis (PA). Three of the four methods are illustrated by means of 
an example. Although the eigenvalue greater than unity rule and the scree test are the most 
widely used methods, parallel analysis and the minimum average partial are the most 
accurate methods. 

Determining the Number of Factors to Retain 

Suppose a researcher has gathered four predictors and wishes to predict some 
outcome (dependent variable). There are six simple correlations among the variables. 
These six correlations are shown as the off-diagonal elements in the correlation matrix in 
Table 1. 

Insert Table 1 About Here 

Conceivably, the researcher might be able to visually inspect the correlation matrix and 
find a pattern or arrive at some conclusion. For example, after visually inspecting the 
correlation matrix in Table 1, the researcher may conclude that although the correlations 
only range from 0.377 to 0.535, these correlations may be noteworthy depending on the 
particular theory being tested. For example, while a low correlation might be important in 
the medical field, the same low correlation might not be important in the education field. 
However, as the number of predictors increases, the number of simple correlations 
increases much faster. In fact, for n predictors, there are n(n-1)/2 simple correlations. 
Thus, if the researcher had gathered 12 predictors, there would be 66 (i.e., 12(12-1)/2 = 
12(11)/2 = 132/2 = 66) simple correlations! These 66 simple correlations are shown as the 
off-diagonal elements in the correlation matrix in Table 2. 
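
The growth in the number of correlations can be checked with a short calculation. The following Python sketch is illustrative only; the function name is hypothetical and is not part of the original analysis.

```python
def n_simple_correlations(n_predictors: int) -> int:
    """Number of distinct pairwise correlations among n predictors: n(n-1)/2."""
    return n_predictors * (n_predictors - 1) // 2


# The two cases discussed in the text: 4 predictors and 12 predictors.
for n in (4, 12):
    print(n, "predictors ->", n_simple_correlations(n), "simple correlations")
```

For 4 and 12 predictors this gives the 6 and 66 correlations counted above.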

Insert Table 2 About Here 



Clearly, no researcher can visually inspect such a correlation matrix and come up with a 
pattern or a conclusion. Consequently, “Some means is needed for determining if there 
are a small number of underlying constructs which might account for the main sources of 
variation in such a complex set of correlations” (Stevens, 1996, p. 362). Another reason 
for wanting to reduce the number of variables is that not every predictor may be 
measuring a different construct. In other words, 

Suppose that we have administered 100 different tests of ability and 
school attainment. In fact, the resulting correlation matrix would consist 
of positive and often high correlations in the region of 0.5 to 0.6. A factor 
analysis would reveal that these could be accounted for by a small number 
of factors: intelligence, verbal ability, and spatial ability. Thus instead of 
having to look at the scores on 100 tests to understand these correlations, 
which no human being is able to do, we could understand them in terms 
of three scores- on intelligence, verbal ability, and spatial ability. 

(Kline, 1994, p. 5) 

Two methods, (a) principal components analysis and (b) factor analysis, are 
commonly used in dealing with this problem. Of the two, “components analysis is the 
most widely used” (Zwick & Velicer, 1982, p. 253). As Stevens (1996) has noted, “In 
factor analysis a mathematical model is set up, and the factors can only be estimated, 
whereas in component analysis we are simply transforming the original variables into the 
new set of linear combinations (the principal components)” (p. 362). 

The purpose of this paper is to present four methods for determining the number 
of factors to retain. These four methods are (a) the eigenvalue greater than unity (K1); (b) 
the scree test; (c) the minimum average partial (MAP); and (d) parallel analysis (PA). 
To illustrate how to apply three of the four methods, a data set originally analyzed by 
Holzinger and Swineford (1939) will be used. This data set was collected by 
administering 24 psychological tests to junior high school students; nine of these tests 
are analyzed in the example that follows. According to Hetzel 
(1996), “This data set has frequently been used by researchers explaining various analytic 
techniques and is representative of the ability that have been used throughout the history 
of factor analysis” (p. 177). 

Four Methods of Extraction 
Eigenvalue Greater than Unity (K1) Method 

Zwick and Velicer (1982) noted that “The most commonly employed rule for 
determining the number of components is to retain those components with eigenvalues 
greater than 1.0” (p. 254). This rule, also known as the K1 rule, was developed by Kaiser 
(1960) but can be traced to Guttman (1954). This method is very simple, objective, and 
easy to use. Moreover, as pointed out by Thompson and Daniel (1996), “This extraction 
rule is the default option in most statistics packages and therefore may be the most widely 
used decision rule, also by default” (p. 200). 

Eigenvalues are an index of variance explained and can range from zero to the 
number of original variables (when variables are being factored) (Hetzel, 1996, p. 187). 
Table 3 presents the eigenvalues of the components for the correlation matrix. According to 
this table, only the first three components have eigenvalues greater than unity (i.e., the 
eigenvalue for component one is 2.945, the eigenvalue for component two is 1.760, and 
the eigenvalue for component three is 1.396). Therefore, using the K1 method, the 
researcher would retain only the first three components for further analysis. 
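
The K1 decision can be written out in a few lines. The sketch below is a minimal illustration that uses the eigenvalues reported in Table 3; the function name and the cutoff parameter are assumptions introduced here, not part of any statistics package.

```python
# Eigenvalues of the nine-variable correlation matrix, as reported in Table 3.
eigenvalues = [2.945, 1.760, 1.396, 0.717, 0.629, 0.535, 0.478, 0.289, 0.251]


def k1_retained(eigs, cutoff=1.0):
    """Return the 1-based numbers of the components whose eigenvalue exceeds the cutoff."""
    return [number for number, value in enumerate(eigs, start=1) if value > cutoff]


retained = k1_retained(eigenvalues)
print("Components retained under K1:", retained)
```

With these values the rule keeps components one, two, and three, as stated in the text.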

Insert Table 3 About Here 

The percentage of variance accounted for is found by dividing the eigenvalue by 
the number of variables in the analysis and multiplying by 100. For this example, the 
number of variables is nine. Thus, the percentage of variance accounted for by (a) the 
first component is 32.727 (i.e., 2.945/9 × 100 ≈ 32.7); (b) the second component is 
19.559 (i.e., 1.760/9 × 100 ≈ 19.6); (c) the third component is 15.515 (i.e., 
1.396/9 × 100 ≈ 15.5), and so on. However, since only those components with 
eigenvalues greater than unity are to be retained, only the first three components are 
extracted. Therefore, the total variance accounted for using the K1 method is 67.8% 
(i.e., 32.727 + 19.559 + 15.515). 
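
The same arithmetic can be reproduced for all nine components. This is a sketch under the assumption that the three-decimal eigenvalues printed in Table 3 are used as input, so the results may differ from the table in the third decimal place.

```python
# Eigenvalues as reported (rounded to three decimals) in Table 3.
eigenvalues = [2.945, 1.760, 1.396, 0.717, 0.629, 0.535, 0.478, 0.289, 0.251]
n_variables = len(eigenvalues)  # nine variables in the example

# Percentage of variance = eigenvalue / number of variables * 100.
pct_variance = [100 * value / n_variables for value in eigenvalues]

# Running total of the percentages (the "Cumulative %" column of Table 3).
cumulative = []
running = 0.0
for pct in pct_variance:
    running += pct
    cumulative.append(running)

for number, (pct, cum) in enumerate(zip(pct_variance, cumulative), start=1):
    print(f"component {number}: {pct:6.2f}%   cumulative {cum:6.2f}%")
```

The cumulative value after the third component is the roughly 67.8% retained under the K1 rule.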

Communality coefficients are also an index of variance accounted for but are 
expressed as proportions. More specifically, “the communality of a variable is that 
proportion of its variance that can be accounted for by the common factors” (Gorsuch, 
1983, p. 29). For example, if the communality were 0.809, the variance of the variable as 
reproduced from only the common factors would be 80.9% of its observed variance. 
The value 0.809 was obtained by summing the squared loadings of variable one on each of 
the three components (i.e., 0.776² + (-0.449)² + 0.068² ≈ 0.809). The rest of the 
communalities are calculated in a similar fashion. Table 4 presents the values of the 
components used to calculate the communalities. The communalities are shown in Table 5. 
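
The communality calculation can be sketched as follows, using the loadings printed in Table 4. The variable names are illustrative assumptions; because the loadings are rounded to three decimals, the results can differ from Table 5 by about 0.001.

```python
# Loadings on components 1, 2, and 3, copied from Table 4.
loadings = {
    "Paragraph Comprehension Test": (0.776, -0.449, 0.068),
    "Sentence Completion Test": (0.749, -0.518, -0.029),
    "Word Meaning Test": (0.762, -0.461, 0.025),
    "Speeded Addition Test": (0.481, 0.508, -0.361),
    "Speeded Counting of Dots in Shape": (0.472, 0.471, -0.506),
    "Speeded Discrim Straight and Curved Caps": (0.534, 0.315, -0.378),
    "Memory of Target Words": (0.423, 0.258, 0.639),
    "Memory of Target Numbers": (0.265, 0.427, 0.609),
    "Memory of Object-Number Association Targets": (0.461, 0.500, 0.285),
}

# Communality of a variable: sum of its squared loadings across the retained components.
communalities = {name: sum(a * a for a in row) for name, row in loadings.items()}

# Summing squared loadings down a column instead recovers that component's eigenvalue.
column_ss = [sum(row[k] ** 2 for row in loadings.values()) for k in range(3)]

for name, h2 in communalities.items():
    print(f"{name:45s} {h2:.3f}")
print("Sums of squared loadings by component:", [round(v, 3) for v in column_ss])
```

Summing across a row gives a communality (Table 5), whereas summing down a column gives the extraction sum of squared loadings for that component (Table 3).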

Insert Tables 4 and 5 About Here 



Studies by a number of researchers (Horn, 1965; Zwick & Velicer, 1982; Zwick 
& Velicer, 1986) have evaluated the accuracy of the eigenvalue greater than unity 
criterion. These researchers found that the number of components retained by the K1 
method is often an overestimate. However, Stevens (1996) noted that 

Generally the criterion was accurate to fairly accurate, with 
gross overestimation occurring only with a large number of 
variables (40) and low communalities (around .40). The 
criterion is more accurate when the number of variables is 
small (10 to 15) or moderate (20 to 30) and the communalities 
are high (>.70). (p. 366) 

Other researchers have stated that the K1 method may sometimes lead to the 
retention of fewer factors than should have been retained. For example, Humphreys 
(1964) concluded that when more components than would have been extracted by the K1 
method were subsequently rotated, the results were more meaningful. 

In summary, the K1 rule, although commonly used, is believed by some 
researchers to sometimes underestimate and by many others to grossly overestimate the 
number of components. Moreover, as pointed out by Zwick and Velicer (1986), 

The use of the K1 rule as the default value in some of the standard 
computer packages (BMDP, SPSS, SAS) is an implicit endorsement 
of the procedure, particularly to naive users. This pattern of explicit 
endorsement by textbook authors and implicit endorsement by 
computer packages, contrasted with empirical findings that the 
procedure is very likely to provide a grossly wrong answer, seems 
to guarantee that a large number of incorrect findings will continue 
to be reported (p. 439). 

Scree Test Method 

A second approach for determining the number of factors to retain is the scree test 
proposed by Cattell (1966). The scree test, available in SPSS and SAS, is based on a 
graph of the eigenvalues. First, a graph is constructed by plotting the eigenvalues along 
the ordinate (y-axis) and the component numbers along the abscissa (x-axis). The graph 
begins with a steep curve and then flattens into an almost straight line. According to 
Cattell (1966), 

This straight end portion we began calling the scree - from the straight 
line rubble and boulders which forms at the pitch of sliding stability 
at the foot of a mountain. The initial implication was that this scree 
represents a “rubbish” of small error factors. (p. 249) 

The scree plot for the data being analyzed is shown in Figure 1. Notice that the 
plot falls steeply across the first three components, whereas an almost flat line joins the 
rest of the components. Thus, there is a break point in the plot. This break point is used to 
determine which factors to retain: all factors located before the break point are 
retained. According to Figure 1, the break point is at component four. Thus, 
the first three components are to be retained for further analysis. For this 
data set, then, the K1 method and the scree test both suggest retaining only the first three 
components. 
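
As a rough text-only stand-in for a scree plot, the sketch below draws one bar per eigenvalue from Table 3 and lists the drop between successive eigenvalues. The function name, the bar width, and the use of successive drops as a reading aid are assumptions made for illustration; locating the break point remains a visual judgment, as described above.

```python
# Eigenvalues as reported in Table 3.
eigenvalues = [2.945, 1.760, 1.396, 0.717, 0.629, 0.535, 0.478, 0.289, 0.251]


def text_scree(eigs, width=40):
    """Return one line per component, with a bar whose length is proportional to the eigenvalue."""
    largest = max(eigs)
    lines = []
    for number, value in enumerate(eigs, start=1):
        bar = "*" * round(width * value / largest)
        lines.append(f"{number:>2}  {value:5.3f}  {bar}")
    return lines


# Drop from each eigenvalue to the next; small, similar drops mark the flat "scree".
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]

print("\n".join(text_scree(eigenvalues)))
print("Successive drops:", [round(d, 3) for d in drops])
```

In this output the bars shorten sharply through component three and then level off from component four onward, which is the pattern the text reads from Figure 1.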

A basic idea behind this selection process is that the variables measure a few 
factors well and a large number of factors much less well. Thus, the predominant factors 
are large and account for most of the variance whereas the other factors are small, 
numerous, and account for less variance. 

Complications when using the scree test include (a) a gradual slope from lower to 
higher eigenvalues with no obvious break point in the plot; (b) more than one break point 
in the plot; and (c) more than one apparently suitable line that may be drawn through the 
low values. Nonetheless, a number of researchers have found the test to be accurate in a 
majority of cases investigated. Another complication when using the scree test is 
interrater reliability. Cattell and Vogelmann (1977) reported high interrater 
reliability, but Crawford and Koopman (1979) reported extremely low interrater 
reliabilities. 

Tucker, Koopman, and Linn (1969) found that the scree test gave the 
correct number of factors in 12 of 18 cases. Similarly, Linn (1968) found that the scree 
test gave the correct number of factors in 7 of 10 cases. 

Zwick and Velicer (1982) found the scree test to be the most accurate of four 
methods evaluated across many examples of matrices of known, noncomplex structure. 
However, four years later the same Zwick and Velicer (1986) concluded that “given these 
drawbacks and the availability of other clearly superior methods, we can no longer 
recommend the scree test as the method of choice for determining the number of 
components in PCA” (p. 440). 

Minimum Average Partial (MAP) Method 

Another decision rule for factor retention is the Minimum Average Partial (MAP) 
rule, proposed by Velicer (1976), which is based on the matrix of partial correlations. 
According to Velicer (1976), the method is exact, can be applied with any covariance 
matrix, and is logically related to the concept of factors representing more than one 
variable. Moreover, the MAP method is expected to produce fewer components than the 
K1 rule. 

Zwick and Velicer (1982) concluded that “A relatively recently introduced 
method, MAP has not been examined systematically to date” (p. 257). However, Zwick 
and Velicer (1986) reported that the MAP rule was more accurate and less variable than 
the K1, Bartlett, or scree methods. Moreover, Zwick and Velicer (1986) concluded 
“Researchers wishing to ignore relatively small major components should use MAP as a 
primary method of determining the number of components to retain” (p. 440). 
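
The MAP rule is not illustrated with the example data in this paper. As a hedged sketch of the mechanics only, the code below partials the first m principal components out of a correlation matrix, computes the average squared partial correlation among the off-diagonal elements, and selects the m at which that average is smallest. It assumes NumPy is available, applies the procedure to the four-variable matrix of Table 1 purely for demonstration, and stops at m = p - 2 to avoid a degenerate residual matrix; the function name and these choices are assumptions, not Velicer's published program.

```python
import numpy as np


def map_test(R):
    """Minimum average partial criterion for a correlation matrix R.

    Returns (m, averages): m is the number of components at which the average
    squared partial correlation is smallest, and averages[m] is that average
    after the first m components have been partialled out (m = 0, ..., p - 2).
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)            # eigh returns ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    loadings = eigvecs * np.sqrt(eigvals)           # principal component loadings

    off_diag = ~np.eye(p, dtype=bool)
    averages = [np.mean(R[off_diag] ** 2)]          # m = 0: nothing partialled out
    for m in range(1, p - 1):
        A = loadings[:, :m]
        C = R - A @ A.T                             # partial covariance matrix
        d = np.sqrt(np.diag(C))
        partial = C / np.outer(d, d)                # partial correlation matrix
        averages.append(np.mean(partial[off_diag] ** 2))
    return int(np.argmin(averages)), averages


# Correlation matrix from Table 1 (used here only to demonstrate the procedure).
R_table1 = [
    [1.000, 0.390, 0.395, 0.457],
    [0.390, 1.000, 0.377, 0.470],
    [0.395, 0.377, 1.000, 0.535],
    [0.457, 0.470, 0.535, 1.000],
]
n_components, avg_sq_partials = map_test(R_table1)
print("Components suggested by MAP:", n_components)
print("Average squared partial correlations:", [round(float(v), 4) for v in avg_sq_partials])
```

Because each retained component must reduce the correlations remaining among the other variables, a component on which only one variable loads tends to raise the average rather than lower it, which is consistent with the rule's link to factors that represent more than one variable.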

Parallel Analysis 

Parallel analysis, conceptualized by Horn (1965), involves the factoring of a 
second matrix, identical to the original matrix with respect to the number of variables (n) 
and the number of observations (N), but formed from distributions of random-normal 
deviates. For example, if one had 1-to-5 Likert scale data for 301 subjects and 9 
variables, a 301-by-9 raw data matrix consisting of 1s, 2s, 3s, 4s, and 5s would be 
generated. Then, the random data matrix is factor analyzed and the corresponding 
eigenvalues computed. These newly computed eigenvalues are compared with those 
obtained from the original data. Any factor whose real-data eigenvalue exceeds the 
associated eigenvalue from the random data is extracted. For example, if the second 
eigenvalue from the real data is 1.760 and the associated eigenvalue from the random 
data is 1.274, the second factor would be extracted. On the other hand, if the fourth 
eigenvalue from the real data is 0.717, and the associated eigenvalue from the random 
data is 1.021, then the fourth factor would not be extracted. Table 3 presents the 
eigenvalues from the real data. Table 6 presents the eigenvalues from the random data. 
In this example, the first three real eigenvalues (2.945, 1.760, and 1.396) exceed their 
random counterparts (1.317, 1.274, and 1.038), so parallel analysis also retains three 
components. 
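
The comparison can be sketched directly from the two tables. In the code below the function name is hypothetical, and stopping at the first component whose real eigenvalue fails to exceed the random one is an assumption about how the rule is applied; for these data no later eigenvalue exceeds its random counterpart, so the choice does not affect the result.

```python
# Eigenvalues from the real data (Table 3) and from the random data (Table 6).
real_eigenvalues = [2.945, 1.760, 1.396, 0.717, 0.629, 0.535, 0.478, 0.289, 0.251]
random_eigenvalues = [1.317, 1.274, 1.038, 1.021, 0.975, 0.945, 0.876, 0.812, 0.741]


def parallel_analysis_count(real_eigs, random_eigs):
    """Count the leading components whose real eigenvalue exceeds the random one."""
    count = 0
    for real, rand in zip(real_eigs, random_eigs):
        if real <= rand:
            break  # assumed stopping rule: stop at the first failure
        count += 1
    return count


n_retained = parallel_analysis_count(real_eigenvalues, random_eigenvalues)
print("Components retained by parallel analysis:", n_retained)
```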

Insert Table 6 About Here 



Humphreys and Montanelli (1975) stated that “the parallel analysis criterion is 
superior to maximum-likelihood as a method for deciding on the number of common 
factors” (p. 201). Moreover, Humphreys and Ilgen (1969) stated that “the technique is 
worthy of consideration for use in making the number of factors decision” (p. 578). 

Zwick and Velicer (1986) concluded “the PA method was the most frequently accurate 
method examined” (p. 440). Although “the general application of the PA method is 
problematic at this time because programs needed for its application are not widely 
available” (Zwick & Velicer, 1986, p. 441), Thompson and Daniel (1996) provide an 
SPSS program that implements the analysis. 
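
For readers without access to such a program, random-data eigenvalues of the kind shown in Table 6 can be approximated with a few lines of general-purpose code. The sketch below is not the SPSS program of Thompson and Daniel (1996); it assumes NumPy is available, draws random-normal data, and averages the eigenvalues over several replications. The function name, the number of replications, and the seed are all assumptions made for illustration, and the values produced will not match Table 6 exactly.

```python
import numpy as np


def mean_random_eigenvalues(n_obs, n_vars, n_reps=50, seed=0):
    """Average the sorted eigenvalues of correlation matrices of random-normal data."""
    rng = np.random.default_rng(seed)
    total = np.zeros(n_vars)
    for _ in range(n_reps):
        data = rng.standard_normal((n_obs, n_vars))   # N-by-n random data matrix
        corr = np.corrcoef(data, rowvar=False)        # n-by-n correlation matrix
        total += np.sort(np.linalg.eigvalsh(corr))[::-1]
    return total / n_reps


# Same dimensions as the example in the text: 301 observations, 9 variables.
random_eigs = mean_random_eigenvalues(301, 9)
print(np.round(random_eigs, 3))
```

The resulting eigenvalues can then be compared, component by component, with the real-data eigenvalues in Table 3.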

Conclusion 

This paper presented four methods for determining the number of factors to retain: 
the eigenvalue greater than unity rule (K1), the scree test, the minimum average partial, 
and parallel analysis. Three of the four methods were illustrated by means of an example 
using the data set from Holzinger and Swineford (1939). All four methods reduce a large 
number of predictors to a small number of factors. Of the four methods, the eigenvalue 
greater than unity (K1) and the scree test methods are the most widely used, probably 
because these methods are the default in most statistics packages. However, the minimum 
average partial and the parallel analysis methods are the most accurate methods. 

Table 1 

Correlation Matrix when n=4 

Correlation      X1       X2       X3       X4
X1            1.000    0.390    0.395    0.457
X2            0.390    1.000    0.377    0.470
X3            0.395    0.377    1.000    0.535
X4            0.457    0.470    0.535    1.000

Table 2 

Correlation Matrix when n=12 

r         X1      X2      X3      X4      X5      X6      X7      X8      X9     X10     X11     X12
X1     1.000   0.297   0.365   0.441   0.295   0.373   0.293   0.331   0.357   0.067   0.286   0.224
X2     0.297   1.000   0.238   0.340   0.150   0.153   0.139   0.184   0.193  -0.076   0.108   0.092
X3     0.365   0.238   1.000   0.305   0.218   0.212   0.173   0.212   0.239   0.040   0.126   0.177
X4     0.441   0.340   0.305   1.000   0.100   0.159   0.077   0.171   0.198   0.072   0.199   0.186
X5     0.295   0.150   0.218   0.100   1.000   0.657   0.716   0.637   0.739   0.175   0.316   0.165
X6     0.373   0.153   0.212   0.159   0.657   1.000   0.733   0.582   0.704   0.174   0.342   0.107
X7     0.293   0.139   0.173   0.077   0.716   0.733   1.000   0.674   0.720   0.102   0.300   0.139
X8     0.331   0.184   0.212   0.171   0.637   0.582   0.674   1.000   0.582   0.132   0.313   0.184
X9     0.357   0.193   0.239   0.198   0.739   0.704   0.720   0.582   1.000   0.121   0.290   0.150
X10    0.067  -0.076   0.040   0.072   0.175   0.174   0.102   0.132   0.121   1.000   0.447   0.487
X11    0.286   0.108   0.126   0.199   0.316   0.342   0.300   0.313   0.290   0.447   1.000   0.398
X12    0.224   0.092   0.177   0.186   0.165   0.107   0.139   0.184   0.150   0.487   0.398   1.000

Table 3 

Total Variance Explained for Real Data 

                                                        Extraction Sums of Squared Loadings
Component    Total    % of Variance    Cumulative %     Total    % of Variance    Cumulative %
1            2.945       32.727           32.727        2.945       32.727           32.727
2            1.760       19.559           52.285        1.760       19.559           52.285
3            1.396       15.515           67.800        1.396       15.515           67.800
4            0.717        7.962           75.762
5            0.629        6.991           82.753
6            0.535        5.946           88.699
7            0.478        5.311           94.010
8            0.289        3.206           97.215
9            0.251        2.785          100.000

Extraction Method: Principal Component Analysis 

Table 4 

Component Matrix 

                                                      Component
                                                  1        2        3
Paragraph Comprehension Test                    .776    -.449     .068
Sentence Completion Test                        .749    -.518    -.029
Word Meaning Test                               .762    -.461     .025
Speeded Addition Test                           .481     .508    -.361
Speeded Counting of Dots in Shape               .472     .471    -.506
Speeded Discrim Straight and Curved Caps        .534     .315    -.378
Memory of Target Words                          .423     .258     .639
Memory of Target Numbers                        .265     .427     .609
Memory of Object-Number Association Targets     .461     .500     .285

Extraction Method: Principal Component Analysis. 
a. 3 components extracted. 

Table 5 

Communalities 

                                                Initial    Extraction
Paragraph Comprehension Test                     1.000       0.809
Sentence Completion Test                         1.000       0.830
Word Meaning Test                                1.000       0.794
Speeded Addition Test                            1.000       0.619
Speeded Counting of Dots in Shape                1.000       0.700
Speeded Discrim Straight and Curved Caps         1.000       0.527
Memory of Target Words                           1.000       0.655
Memory of Target Numbers                         1.000       0.623
Memory of Object-Number Association Targets      1.000       0.544

Extraction Method: Principal Component Analysis. 

Table 6 

Total Variance Explained for Random Data 

Component    Total    % of Variance    Cumulative %
1            1.317       14.637           14.637
2            1.274       14.159           28.796
3            1.038       11.530           40.325
4            1.021       11.344           51.669
5            0.975       10.834           62.503
6            0.945       10.503           73.005
7            0.876        9.738           82.743
8            0.812        9.020           91.764
9            0.741        8.236          100.000

Extraction Method: Principal Component Analysis 

Figure 1 

Scree Plot (x-axis: Component Number) 